Web Collaborations

The Internet Archive is working to prevent the Internet, a new medium with major historical significance, and other "born-digital" materials from disappearing into the past. Collaborating with institutions including the Library of Congress and the Smithsonian, we are working to preserve a record for generations to come.

Smithsonian Institution's 1996 US Election Display

Page from a snapshot of the Web, now in the Smithsonian

A display at the Smithsonian Institution shows how presidential candidates and parties first used the Web. The display includes 1996 campaign pages for five political parties, as well as pages such as the "Steve Forbes Official Home Page" and the "Official Internet Headquarters of the [Pat] Buchanan Brigade," which were captured before some candidates dropped out of the race and scaled back or shut down their sites.

The display also includes pages from the Federal Election Commission site with financial information about candidates, parties, and political action committees.


World Wide Web 1997: 2 Terabytes in 63 Inches

Sculpture of 1997 Web snapshot in the lobby of the Library of Congress

What would a snapshot of the Web look like? Visitors passing through the lobby of the Library of Congress get the picture when they see a sculpture — a stack of computer screens and tapes housing a snapshot of the Web in early 1997 — by Alan Rath. The Internet Archive is proud to have part of its collections in the Library of Congress.

Data gift of Alexa Internet


Xerox PARC Research Projects

"It Grows on Its Own Like an Ecosystem"

The Internet Ecologies Area at Xerox’s Palo Alto Research Center is using multiple snapshots from the Internet Archive on disk — "the Web in a box" — as a kind of test tube for understanding the Web. "We see the Web as an ‘information ecology,’ where we study the relationships between people and information," says PARC researcher Jim Pitkow.

PARC "benefited greatly" from access to the Archive’s crawls, says Pitkow’s colleague and Stanford physics professor Bernardo Huberman. According to Pitkow, access to the snapshots "is great for researchers because it lets them fuse traditional tools and techniques with new tools that haven’t existed before."

Huberman describes a PARC study that produced a mathematical "law of surfing," which says that Web traffic follows predictable, regular patterns. For example, in a manifestation of the "winner take all" principle, it turns out that just a few Web sites get most of the traffic. The researchers were also able to show how deeply people delve into a typical Web site: on average, it’s about a page and a half. Huberman has also studied Internet congestion as a social dilemma, where people weigh the costs and benefits of putting up with slow traffic versus waiting until the network is less crowded.

In a study of the topology of the Web, a Stanford graduate student working on PARC’s Internet ecology project found that any two Web sites are no more than four clicks away from each other — hard evidence that the world is smaller than it seems, on the Web at least.

Research on this scale and of this complexity makes new thinking possible in a whole range of fields, from graph theory to sociology. Pitkow compares what’s happening to the Einstein-era thrust past the limitations of Newtonian physics into quantum mechanics: "The Web," he says, "requires a whole new form of understanding."

Wayback Machine Forum (closed)

Subject Poster Replies
This forum is closed Jeff Kaplan 0
URL archived in Sept 12, 2004, but can't be accessed KTRMAmbiance 0
Google site archive keeps reloading itself in an infinite loop Stargate38 0
CDX Server not working PDXMatt 1
   Re: CDX Server not working PDXMatt 0
Deviantart Needs Better Archiving AnonymousJohnGrant 0
Accessing videos from YouTube in 2007 Reginald Johnson 0
Saving now requires login??? Death of the Internet Archive 0
Removal of site from wayback machine Emilyx97 1
   Re: Removal of site from wayback machine idnull 0
Archived ZIP file only downloads HTML script instead of ZIP file. WareNetwork2000 0
Archive Showing Raw Code Glitchdemall 0
Video stuck/won't play jarrod1997 0
Can't capture websites by Finnish government petskuu 0
Deviantart archives disappearing Olaf53 0
Please add the Aprender Todo website chricho 0
How to Delete Item herelol 0
demand removal of my domain PCKING0154 0
Suggestion: "back to forums" button. TL7 1
   Re: Suggestion: 'back to forums' button. APIFILES 0
Trivial question: Wayback excludes Quora.com despite already blocked by robots.txt? TechLord 0
Why are Goggle.com and Goggle.org excluded from Wayback? TechLord 0
"Wayback Machine doesn't have that page archived" - diacritic character issue? DaoYang 1
   Re: 'Wayback Machine doesn't have that page archived' - diacritic character issue? Swnairex 1
     Re: 'Wayback Machine doesn't have that page archived' - diacritic character issue? DaoYang 1
       Re: 'Wayback Machine doesn't have that page archived' - diacritic character issue? cverwalter 0
wayback machine get number of captures chaymag 0
No Images sste 0
Not Found The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.(Error) W*E*R*D*N*A 1
   Re: Not Found The requested URL was not found... DoomTay 0
Let's say I have old files that are no longer available at Wayback Machine, can the Wayback Machine put files back into the archive? ExoticLover 2
   Re: Let's say I have old files that are no longer available at Wayback Machine, can the Wayback Machine put files back into the archive? Jeff Kaplan 0
   Re: Let's say I have old files that are no longer available at Wayback Machine, can the Wayback Machine put files back into the archive? Lilianamillar 0
Archieve of my post about komedo hadingrh 1
   Re: Archieve of my post about komedo hadingrh 0
how i can find delete youtube websites for Debelah Morgan? alexclarke231 0
Wayback Machine has a 504 error message! angeldeb82 0
Site missing Crawls it once had. NCWLNickGemini 1
   Re: Site missing Crawls it once had. MeditateOrDie 1
     Re: Site missing Crawls it once had. NCWLNickGemini 1
       Re: Site missing Crawls it once had. MeditateOrDie 1
         Re: Site missing Crawls it once had. ajacraig2011 0
Wayback Machine not archiving site properly? DanTheMan827 0
How do set Wayback Machine to archive my website Govt Career Updates 1
   Re: How do set Wayback Machine to archive my website MeditateOrDie 0
404 - Redir question devuser 0
Why does some censorship exist? Animedude5555 0
Please add Electric Furnace Mrout 0
Retrieve photo's old Hyves profile stehof 0
test only SeaDoo 0
Site specific search options? JaneLeia 0
Site Removal Please MGMidget1234 0
Site Removal Request 4687431212 1
   Re: Site Removal Request 4687431212 0
Takedown request victorlsxiv 0
only two hours left of April20 (420): everybody Wayback cannabis homepages EarthFurst 2
   Re: only two hours left of April20 (420): everybody Wayback cannabis homepages EarthFurst 0
   Re: only two hours left of April20 (420): everybody Wayback cannabis homepages EarthFurst 0
"archived" pages disappearing from Wayback: reference at archive.is EarthFurst 1
   Re: 'archived' pages disappearing from Wayback: reference at archive.is Jeff Kaplan 1
     Re: 'archived' pages disappearing from Wayback: reference at archive.is EarthFurst 1
       Re: 'archived' pages disappearing from Wayback: reference at archive.is Jeff Kaplan 1
         Re: 'archived' pages disappearing from Wayback: reference at archive.is systemsplanet 0
The Wayback Machine Forum is "(closed)", but nothing will stop me from adding this post– BELIEVE IT! pegzmasta 1
   Re: Original Archive is '(closed)' PDpolice 1
     Re: Original Archive is '(closed)' pegzmasta 0
Multiple Set-Cookie Headers: Wayback Philip_Reeds_Freedom 0
Hi, Wayback– Problem Solved! pegzmasta 1
   This Is Only a Test Dupenhagen Moonbat 1
     Re: This Is Only a Test pegzmasta 0
how to query for all the websites that end in ".com.br"? LucasMation 1
   Re: how to query for all the websites that end in '.com.br'? pegzmasta 1
     Re: how to query for all the websites that end in '.com.br'? LucasMation 1
       Re: how to query for all the websites that end in '.com.br'? pegzmasta 1
         Re: how to query for all the websites that end in '.com.br'? sahil7459 0
Challenge: Read, Reply, and Correct! [The Internet Archive is tasked with preserving content on the Internet, but will it preserve and fix it's own forums?] pegzmasta 0
How long does it take to get a response from info@archive.org? juwhyonee 1
   Re: How long does it take to get a response from info@archive.org? aanon 0
problem with waybacks of comicbookresources.com homepage after 2013 EarthFurst 0
my website is not archiving jon617 0
So does excluding via robots actually delete or not? talkingnewspapers 0
Crawl and archive a whole website recursively maltris 1
   Re: Crawl and archive a whole website recursively B4CK and F0RTH 0
My Website Is Not Crawled Despite Removing Restrictions From Robots.txt leodwight 0
What is the algorithm for deciding when to not crawl a page anymore? zwol 0
End of an era: Imageshack deletes free accounts Javik 0
Wayback machine rebuild suggestions Archive Lover1 1
   Re: Wayback machine rebuild suggestions h891322 0
Entire website archival tycio 0
Late 2007 Archive... Gone? PeabodySam 0
How do I retrieve the original form of a page from the Wayback Machine? zwol 1
   Re: How do I retrieve the original form of a page from the Wayback Machine? DKL3 2
     Re: How do I retrieve the original form of a page from the Wayback Machine? zwol 0
     Re: How do I retrieve the original form of a page from the Wayback Machine? slowride13 1
       Re: How do I retrieve the original form of a page from the Wayback Machine? Samuel Bronson 0
Cannot see content on website but could see before ? Izzy15 1
   Re: Cannot see content on website but could see before ? slowride13 1
     Re: Cannot see content on website but could see before ? Izzy15 0
Cannot see content on website but could see before ? Izzy15 0
searching url substring iaw4 0
Help accessing a site that's been robot.txted? DiamondBlade11 0
Wayback snapshot deleted? Junction10 0
please delete these two files ahmad_accounts1 1
   Re: please delete these two files Jeff Kaplan 0
Your Wayback Machine is getting very slow angeldeb82 0
